Abstract

We consider the issue of performing accurate small-sample testing inference in 
beta regression models, which are useful for modeling continuous variates that 
assume values in (0,1), such as rates and proportions. We derive the Bartlett 
correction to the likelihood ratio test statistic and also consider a bootstrap 
Bartlett correction. Using Monte Carlo simulations we compare the finite sample
performances of the two corrected tests to that of the standard likelihood
ratio test and also to its variant that employs Skovgaard’s adjustment; the latter
is already available in the literature. The numerical evidence favors the
corrected tests we propose. We also present an empirical application. 
1. Introduction


Regression analysis is commonly used to model the relationship between a 
dependent variable (response) and a set of explanatory variables (covariates). 
The linear regression model is the most used regression model in empirical
applications, but it is not appropriate when the variable of interest assumes values
in the standard unit interval, as is the case of rates and proportions. For these
situations, Ferrari and Cribari-Neto (2004) proposed a regression model based on
the assumption that the response (y) is beta distributed. Their model is similar
to those that belong to the class of generalized linear models (McCullagh and
Nelder, 1989).


The beta density can be expressed as

$$f(y; \mu, \phi) = \frac{\Gamma(\phi)}{\Gamma(\mu\phi)\,\Gamma((1-\mu)\phi)}\, y^{\mu\phi - 1} (1-y)^{(1-\mu)\phi - 1}, \quad 0 < y < 1, \qquad (1)$$

with $0 < \mu < 1$ and $\phi > 0$, and

$$E(y) = \mu, \quad \mathrm{var}(y) = \frac{V(\mu)}{1 + \phi},$$

where $V(\mu) = \mu(1-\mu)$ is the variance function and $\phi$ can be viewed as a precision
parameter. The beta distribution is very flexible since its density can assume
different shapes depending on the values of the two parameters. In particular,
it can be symmetric, asymmetric, J-shaped and inverted J-shaped; see Figure 1.

Figure 1: Beta densities for different values of $\mu$ (indicated in the panels), with $\phi = 10$ (a)
and $\phi = 90$ (b).
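The mean/precision parametrization of the beta density can be checked numerically. The following is a minimal sketch, assuming the density is indexed by the mean $\mu$ and the precision $\phi$ as described above; the function names, the midpoint integration rule and the values $\mu = 0.25$, $\phi = 10$ are illustrative choices, not part of the original paper.

```python
import math


def beta_density(y, mu, phi):
    """Beta density with mean mu (0 < mu < 1) and precision phi (> 0)."""
    if not (0.0 < y < 1.0 and 0.0 < mu < 1.0 and phi > 0.0):
        raise ValueError("need 0 < y < 1, 0 < mu < 1 and phi > 0")
    # Work on the log scale so that large phi does not overflow the gamma function.
    log_f = (math.lgamma(phi)
             - math.lgamma(mu * phi)
             - math.lgamma((1.0 - mu) * phi)
             + (mu * phi - 1.0) * math.log(y)
             + ((1.0 - mu) * phi - 1.0) * math.log(1.0 - y))
    return math.exp(log_f)


def beta_variance(mu, phi):
    """var(y) = V(mu) / (1 + phi), with variance function V(mu) = mu (1 - mu)."""
    return mu * (1.0 - mu) / (1.0 + phi)


def midpoint_integral(func, n=20000):
    """Crude midpoint-rule integral of func over (0, 1); illustrative only."""
    h = 1.0 / n
    return h * sum(func((i + 0.5) * h) for i in range(n))


mu, phi = 0.25, 10.0  # illustrative values
total_mass = midpoint_integral(lambda y: beta_density(y, mu, phi))
mean = midpoint_integral(lambda y: y * beta_density(y, mu, phi))
variance = midpoint_integral(lambda y: (y - mu) ** 2 * beta_density(y, mu, phi))
```

Here `total_mass`, `mean` and `variance` should agree, up to integration error, with 1, `mu` and `beta_variance(mu, phi)`, respectively. Larger values of `phi` concentrate the density around `mu`, which is why $\phi$ is read as a precision parameter.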

The class of beta regression models allows practitioners to model responses 
that belong to the interval (0,1) using a regression structure that contains a 
link function, covariates and unknown parameters. Several authors have used
beta regression models and alternative modeling strategies in different fields; see,
e.g., Brehm and Gates (1993), Hancox et al. (2010), Kieschnick and McCullough
(2003), Smithson and Verkuilen (2006) and Zucco (2008).


One may be tempted to view the logistic regression as an alternative to 
the class of beta regressions. However, logistic regression is used when the 
response is binary, i.e., y only assumes two values, namely: 0 and 1. In that 
case, one models Pr(y = 1) as a function of covariates. Beta regression, on the 
other hand, is used when the response is continuous and assumes values in the
standard unit interval. Beta regression is useful for modeling rates, proportions,
concentration indices (e.g., Gini) and other variates that assume values in (0,1)
or, more generally, in (a, b), where a and b are known (a < b).

Testing inference in beta regression is usually carried out using the likelihood
ratio test. The test employs an approximate critical value which is obtained from
the limiting null distribution of the test statistic ($\chi^2$). It is thus an approximate test
and size distortions are likely to take place in small samples. This happens
because, when the number of data points is not large, the exact null
distribution of the test statistic is oftentimes poorly approximated by its asymptotic counterpart.
Testing inference can be made more reliable by transforming the likelihood ratio
statistic using a Bartlett correction (Bartlett, 1937). Such a correction depends
on the log-likelihood cumulants and mixed cumulants up to fourth order. The
derivation of a closed-form expression for the Bartlett correction factor in beta
regressions can be quite cumbersome since the mean and precision parameters
are not orthogonal, unlike in generalized linear models.

A useful approach to improve inferences in small samples, particularly when
the Bartlett correction is analytically cumbersome, is Skovgaard’s adjustment
(Skovgaard, 2001). This adjustment is more straightforward than the Bartlett
correction, only requiring second order log-likelihood derivatives. It does not
require orthogonality between nuisance parameters and parameters of interest.
Skovgaard’s adjustments for varying dispersion beta and inflated beta regressions
were derived by Ferrari and Pinheiro (2011) and Pereira (2010), respectively.
Ferrari and Cysneiros (2008) obtained a similar adjustment for exponential family
nonlinear models. The numerical results presented by these authors reveal
that the modified likelihood ratio test obtained using Skovgaard’s proposal is
less size distorted than the original likelihood ratio test when the sample size is
small.

A shortcoming of Skovgaard’s correction is that it does not improve the rate
at which size distortions vanish, i.e., it does not yield asymptotic refinements.
Bartlett corrections, in contrast, deliver asymptotic refinements but, as
noted earlier, are more difficult to obtain. They are usually derived using a general result
given by Lawley (1956). An alternative is to use results in Cordeiro (1993),
which are written in matrix fashion. Another alternative for models in which the
derivation of the Bartlett correction is analytically cumbersome is the bootstrap
Bartlett correction (Rocke, 1989). Here, the Bartlett correction factor is determined
using bootstrap resampling (Efron, 1979).

Our main goal in this paper is to derive the Bartlett correction factor to
the likelihood ratio test in the class of beta regressions. The derivation is quite
cumbersome since in beta regressions the mean regression parameter vector is
not orthogonal to the precision parameter. We were able to obtain, after extensive
algebra, the Bartlett correction for fixed dispersion beta regressions. We
also consider the bootstrap Bartlett correction, i.e., we numerically estimate the
Bartlett correction factor. Finally, we perform extensive Monte Carlo simulations
where we compare the finite sample behavior of Bartlett corrected tests
(analytically and numerically) to that of the modified likelihood ratio test of
Ferrari and Pinheiro (2011). The numerical evidence favors the two Bartlett
corrected tests, especially the bootstrap Bartlett corrected test.

The paper unfolds as follows. Section 2 introduces the beta regression model
proposed by Ferrari and Cribari-Neto (2004). In Section 3 we derive the Bartlett
correction factor to the likelihood ratio test in fixed dispersion beta regressions.
We also present the bootstrap Bartlett correction and the modified likelihood
ratio statistics obtained by Ferrari and Pinheiro (2011). Monte Carlo simulation
results are presented and discussed in Section 4. Section 5 presents an
application that uses real (not simulated) data. Concluding remarks are offered
in the last section and the log-likelihood cumulants we derived are presented in
Appendix A.


2. The beta regression model 

Let $y = (y_1, \ldots, y_n)^\top$ be a vector of $n$ independent random variables, each $y_i$,
$i = 1, \ldots, n$, having density (1) with mean $\mu_i$ and unknown precision parameter
$\phi$. The beta regression model can be written as

$$g(\mu_i) = \sum_{j=1}^{p} x_{ij}\beta_j = \eta_i, \qquad (2)$$

where $\beta = (\beta_1, \ldots, \beta_p)^\top$ is an unknown parameter vector and $x_{i1}, \ldots, x_{ip}$ are
observations on the $p$ covariates ($p < n$). When an intercept is included in the
model, we have $x_{i1} = 1$, for $i = 1, \ldots, n$. Finally, $g(\cdot)$ is a strictly monotonic and
twice differentiable link function, with domain in (0,1) and image in $\mathbb{R}$. Some
commonly used link functions are logit, probit, cloglog, loglog and Cauchy.
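The link functions listed above, and the way a linear predictor is mapped back to a mean in (0,1), can be sketched as follows. This is only an illustration: the function names and the `LINKS` dictionary are hypothetical, and the log-log link is written in one common convention, $g(\mu) = -\log(-\log \mu)$, which is an assumption since the text does not spell it out.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the probit link


def inv_logit(eta):
    """Numerically stable inverse of the logit link."""
    if eta >= 0.0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


# Each entry maps a link name to the pair (g, g inverse).
LINKS = {
    "logit": (lambda mu: math.log(mu / (1.0 - mu)), inv_logit),
    "probit": (_STD_NORMAL.inv_cdf, _STD_NORMAL.cdf),
    "cloglog": (lambda mu: math.log(-math.log(1.0 - mu)),
                lambda eta: 1.0 - math.exp(-math.exp(eta))),
    # Assumed convention for the log-log link: g(mu) = -log(-log(mu)).
    "loglog": (lambda mu: -math.log(-math.log(mu)),
               lambda eta: math.exp(-math.exp(-eta))),
    "cauchy": (lambda mu: math.tan(math.pi * (mu - 0.5)),
               lambda eta: 0.5 + math.atan(eta) / math.pi),
}


def mean_from_predictor(x_row, beta, link="logit"):
    """Return mu_i = g^{-1}(eta_i), with eta_i = sum_j x_ij * beta_j."""
    eta = sum(x * b for x, b in zip(x_row, beta))
    return LINKS[link][1](eta)
```

For example, `mean_from_predictor([1.0, 0.2], [1.0, -4.0])` evaluates the inverse logit at the linear predictor $1 - 0.8 = 0.2$. All five links map (0,1) onto the real line, so any real-valued predictor yields a valid mean.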

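Estimation of the $k$-dimensional parameter vector $\theta = (\beta^\top, \phi)^\top$, where $k = p + 1$,
can be performed by maximum likelihood. The log-likelihood function
is

$$\ell(\theta) = \ell(\theta; y) = \sum_{i=1}^{n} \ell_i(\mu_i, \phi), \qquad (3)$$

where

$$\ell_i(\mu_i, \phi) = \log\Gamma(\phi) - \log\Gamma(\mu_i\phi) - \log\Gamma((1-\mu_i)\phi) + (\mu_i\phi - 1)\log y_i + \{(1-\mu_i)\phi - 1\}\log(1 - y_i).$$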
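The log-likelihood above is straightforward to evaluate. The sketch below assumes a logit link and plain Python lists for the covariate matrix; the function names and the small data set are hypothetical and serve only to show the computation.

```python
import math


def inv_logit(eta):
    """Numerically stable inverse of the logit link."""
    if eta >= 0.0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


def beta_loglik(beta, phi, X, y):
    """Log-likelihood of a fixed-dispersion beta regression with logit link.

    beta: list of p regression coefficients
    phi:  precision parameter (> 0)
    X:    list of n rows, each a list of p covariate values
    y:    list of n responses in (0, 1)
    """
    total = 0.0
    for x_row, y_i in zip(X, y):
        eta = sum(x * b for x, b in zip(x_row, beta))
        mu = inv_logit(eta)
        total += (math.lgamma(phi)
                  - math.lgamma(mu * phi)
                  - math.lgamma((1.0 - mu) * phi)
                  + (mu * phi - 1.0) * math.log(y_i)
                  + ((1.0 - mu) * phi - 1.0) * math.log(1.0 - y_i))
    return total


# Hypothetical data: an intercept plus one covariate.
X = [[1.0, -0.3], [1.0, 0.1], [1.0, 0.4]]
y = [0.62, 0.71, 0.55]
value = beta_loglik([0.5, 0.2], 20.0, X, y)
```

A quick sanity check: with an intercept-only model, $\beta_1 = 0$ and $\phi = 2$, each observation is uniform on (0,1), so the log-likelihood is zero whatever the data.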


The score function $U(\theta)$ is obtained by differentiating the log-likelihood function
with respect to the unknown parameters. The score functions with respect to $\beta$
and $\phi$ are, respectively,

$$U_\beta(\theta) = \phi X^\top T (y^* - \mu^*),$$

$$U_\phi(\theta) = \sum_{i=1}^{n} \left\{ \mu_i (y_i^* - \mu_i^*) + \log(1 - y_i) - \psi((1-\mu_i)\phi) + \psi(\phi) \right\},$$

where $X$ is the $n \times p$ covariates matrix whose $i$-th row is $x_i^\top$. Also, $T =
\mathrm{diag}\{1/g'(\mu_1), \ldots, 1/g'(\mu_n)\}$, $y^* = (y_1^*, \ldots, y_n^*)^\top$, $\mu^* = (\mu_1^*, \ldots, \mu_n^*)^\top$,
$y_i^* = \log\{y_i/(1 - y_i)\}$, $\mu_i^* = \psi(\mu_i\phi) - \psi((1-\mu_i)\phi)$ and $\psi(\cdot)$ is the digamma function.¹

The maximum likelihood estimators are the solution to the following system:

$$U_\beta(\theta) = 0, \qquad U_\phi(\theta) = 0.$$

¹The polygamma function is defined, for $m = 0, 1, \ldots$, as
$\psi^{(m)}(x) = (d^{m+1}/dx^{m+1}) \log\Gamma(x)$, $x > 0$. The digamma function is obtained by setting $m = 0$.

The maximum likelihood estimators, $\hat\beta$ and $\hat\phi$, cannot be expressed in closed
form. They are typically obtained by numerically maximizing the log-likelihood
function using a Newton or quasi-Newton nonlinear optimization algorithm.
For details on nonlinear optimization algorithms, see Press et al. (1992).
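Analytic scores are easy to get wrong, so it can help to compare them with a numerical derivative of the log-likelihood. The sketch below does this for the contribution of a single observation to the score for $\phi$, i.e. the derivative of $\ell_i(\mu_i, \phi)$ with respect to $\phi$. The digamma routine is a hand-written approximation (recurrence plus a truncated asymptotic series), included only because the Python standard library has no digamma function; all names and numerical values are illustrative.

```python
import math


def digamma(x):
    """Approximate psi(x) for x > 0.

    Uses psi(x) = psi(x + 1) - 1/x to shift x above 6, then a truncated
    asymptotic series; this is an illustrative approximation.
    """
    if x <= 0.0:
        raise ValueError("x must be positive")
    shift = 0.0
    while x < 6.0:
        shift -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = (math.log(x) - 0.5 / x
              - inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0)))
    return shift + series


def loglik_obs(mu, phi, y):
    """Log-likelihood contribution of one observation."""
    return (math.lgamma(phi) - math.lgamma(mu * phi)
            - math.lgamma((1.0 - mu) * phi)
            + (mu * phi - 1.0) * math.log(y)
            + ((1.0 - mu) * phi - 1.0) * math.log(1.0 - y))


def score_phi_obs(mu, phi, y):
    """Derivative of loglik_obs with respect to phi, in terms of y* and mu*."""
    y_star = math.log(y / (1.0 - y))
    mu_star = digamma(mu * phi) - digamma((1.0 - mu) * phi)
    return (mu * (y_star - mu_star) + math.log(1.0 - y)
            - digamma((1.0 - mu) * phi) + digamma(phi))


mu, phi, y = 0.3, 12.0, 0.41  # illustrative values
h = 1e-5
numeric = (loglik_obs(mu, phi + h, y) - loglik_obs(mu, phi - h, y)) / (2.0 * h)
analytic = score_phi_obs(mu, phi, y)
```

The two quantities `numeric` and `analytic` should agree closely; a visible discrepancy would point to an algebra or coding error in the score.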
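Fisher’s joint information for $\beta$ and $\phi$ is given by

$$K = K(\theta) = \begin{pmatrix} K_{(\beta,\beta)} & K_{(\beta,\phi)} \\ K_{(\phi,\beta)} & K_{(\phi,\phi)} \end{pmatrix},$$

where $K_{(\beta,\beta)} = \phi X^\top W X$, $K_{(\beta,\phi)} = (K_{(\phi,\beta)})^\top = X^\top T c$ and $K_{(\phi,\phi)} = \mathrm{tr}(D)$.
Here, $W$ ($n \times n$ diagonal matrix), $c$ ($n$-vector) and $D$ ($n \times n$ diagonal matrix)
have typical elements given by

$$w_i = \phi \{\psi'(\mu_i\phi) + \psi'((1-\mu_i)\phi)\} \frac{1}{\{g'(\mu_i)\}^2},$$

$$c_i = \phi \{\psi'(\mu_i\phi)\mu_i - \psi'((1-\mu_i)\phi)(1-\mu_i)\},$$

$$d_i = \psi'(\mu_i\phi)\mu_i^2 + \psi'((1-\mu_i)\phi)(1-\mu_i)^2 - \psi'(\phi),$$

respectively. That is, $W = \mathrm{diag}\{w_1, \ldots, w_n\}$, $c = (c_1, \ldots, c_n)^\top$ and
$D = \mathrm{diag}\{d_1, \ldots, d_n\}$. For details on log-likelihood derivatives, see Appendix A.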


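Under mild regularity conditions, and in large samples, the joint distribution
of $\hat\beta$ and $\hat\phi$ is approximately $k$-multivariate normal:

$$\begin{pmatrix} \hat\beta \\ \hat\phi \end{pmatrix} \sim \mathcal{N}_k\left( \begin{pmatrix} \beta \\ \phi \end{pmatrix}, K^{-1} \right),$$

approximately.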

3. Improved likelihood ratio testing inference 

Consider the parametric model presented in (2) and the corresponding log-likelihood
function given in (3), where $\theta = (\theta_1^\top, \theta_2^\top)^\top$ is the model $k$-dimensional
parameter vector, $\theta_1$ being a $q$-dimensional vector and $\theta_2$ containing the remaining
$k - q$ parameters. Suppose that we wish to test the null hypothesis

$$\mathcal{H}_0: \theta_1 = \theta_1^0$$

against the alternative hypothesis

$$\mathcal{H}_1: \theta_1 \neq \theta_1^0,$$

where $\theta_1^0$ is a given $q \times 1$ vector of scalars. Hence, $\theta_2$ is the vector of nuisance
parameters and $\theta_1$ is the vector of parameters of interest. The null hypothesis
imposes $q$ restrictions on the parameter vector. The likelihood ratio test statistic
can be written as

$$LR = 2\{\ell(\hat\theta; y) - \ell(\tilde\theta; y)\},$$

where $\hat\theta$ is the unrestricted maximum likelihood estimator of $\theta$ and $\tilde\theta$ is the
restricted maximum likelihood estimator obtained by imposing the null hypothesis,
i.e., $\tilde\theta = (\theta_1^{0\top}, \tilde\theta_2^\top)^\top$.

In large samples, the likelihood ratio statistic $LR$ is approximately distributed
as $\chi^2_q$ under $\mathcal{H}_0$ with error of the order $n^{-1}$. In small samples, however,
this approximation may be poor. Since the test is conducted using critical values
obtained from the limiting null distribution ($\chi^2_q$), and since such a distribution
may provide a poor approximation to the exact null distribution of the test statistic
in small samples, the likelihood ratio test may be considerably size distorted.
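The asymptotic test itself amounts to comparing the statistic with a chi-squared tail probability. A minimal sketch follows, assuming the two maximized log-likelihood values are already available; the function names and the numerical log-likelihood values are hypothetical. The chi-squared survival function is computed for integer degrees of freedom from closed forms and a standard recurrence for the regularized incomplete gamma function.

```python
import math


def chi2_sf(x, q):
    """Pr(chi-squared with q degrees of freedom > x), for integer q >= 1."""
    if q < 1 or int(q) != q:
        raise ValueError("q must be a positive integer")
    if x <= 0.0:
        return 1.0
    z = x / 2.0
    if q % 2 == 1:
        a, sf = 0.5, math.erfc(math.sqrt(z))  # q = 1
    else:
        a, sf = 1.0, math.exp(-z)             # q = 2
    # Recurrence Q(a + 1, z) = Q(a, z) + z^a exp(-z) / Gamma(a + 1).
    while a < q / 2.0:
        sf += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return sf


def lr_statistic(loglik_unrestricted, loglik_restricted):
    """LR = 2 {l(theta hat) - l(theta tilde)}."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)


# Hypothetical maximized log-likelihoods, testing q = 1 restriction.
lr = lr_statistic(25.3, 22.9)
p_value = chi2_sf(lr, 1)
reject_at_5_percent = p_value < 0.05
```

The point made in the text is that this p-value relies on the limiting distribution; in small samples the true tail probability can be quite different, which is what the corrections discussed next try to repair.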

Likelihood ratio testing inference can be made more accurate by applying
a correction factor to the test statistic. This correction factor is known as the
Bartlett correction and was proposed by Bartlett (1937) and later generalized
by Lawley (1956). The underlying idea is to base inferences on the modified
statistic given by $LR/c$, where $c = E(LR)/q$ is the Bartlett correction factor.
It is possible to express the Bartlett correction factor $c$ using moments of
log-likelihood derivatives; see Lawley (1956). It is noteworthy that the Bartlett
correction delivers an improvement in the rate at which size distortions vanish;
see Barndorff-Nielsen and Cox (1984). In particular, $\Pr(LR \leq x) = \Pr(\chi^2_q \leq x) + O(n^{-1})$
and $\Pr(LR/c \leq x) = \Pr(\chi^2_q \leq x) + O(n^{-2})$.


3.1. A matrix formula for the Bartlett correction factor 
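The Bartlett correction factor can be written as

$$c = 1 + \frac{\epsilon_k - \epsilon_{k-q}}{q}.$$

Using Lawley’s expansion (Lawley, 1956), the expected value of the likelihood
ratio statistic can be expressed as

$$E(LR) = q + \epsilon_k - \epsilon_{k-q} + O(n^{-2}),$$

where

$$\epsilon_k = \sum_{\theta} (\lambda_{rstu} - \lambda_{rstuvw}), \qquad (4)$$

$$\lambda_{rstu} = \kappa^{rs}\kappa^{tu} \left\{ \frac{\kappa_{rstu}}{4} - \kappa_{rst}^{(u)} + \kappa_{rt}^{(su)} \right\},$$

$$\lambda_{rstuvw} = \kappa^{rs}\kappa^{tu}\kappa^{vw} \left\{ \kappa_{rtv}\left( \frac{\kappa_{suw}}{6} - \kappa_{sw}^{(u)} \right) + \kappa_{rtu}\left( \frac{\kappa_{svw}}{4} - \kappa_{sw}^{(v)} \right) + \kappa_{rt}^{(v)}\kappa_{sw}^{(u)} + \kappa_{rt}^{(u)}\kappa_{sw}^{(v)} \right\},$$

and

$$\kappa_{rs} = E\left( \frac{\partial^2 \ell(\theta)}{\partial\theta_r \partial\theta_s} \right), \quad \kappa_{rst} = E\left( \frac{\partial^3 \ell(\theta)}{\partial\theta_r \partial\theta_s \partial\theta_t} \right), \quad \kappa_{rs}^{(t)} = \frac{\partial \kappa_{rs}}{\partial\theta_t}, \quad \text{etc.}$$

Notice that $-\kappa^{rs}$ is the $(r, s)$ element of the inverse of Fisher’s information
matrix, $K^{-1}$. The summation in (4) runs over all components of $\theta$, i.e., the
indices $r$, $s$, $t$, $u$, $v$ and $w$ vary over all $k$ parameters. The expression for
$\epsilon_{k-q}$ is obtained from (4) by letting the summation run only over the nuisance
parameters in $\theta_2$. All $\kappa$’s are of order $n$, and $\epsilon_k$ and $\epsilon_{k-q}$ are of order $O(n^{-1})$.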

It can be quite hard to derive the Bartlett correction using Lawley’s general
formula, since it involves the product of mixed cumulants that are not invariant
under index permutations (Cordeiro, 1993). In particular, in the beta regression
model the parameters $\beta$ and $\phi$ are not orthogonal, i.e., Fisher’s information matrix
is not block diagonal, and as a consequence the derivation of the Bartlett correction
via Lawley’s approach becomes especially cumbersome. An alternative is to
use the general matrix formula given by Cordeiro (1993).


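In order to express $\epsilon_k$ in matrix form, we first define the following $k \times k$
matrices: $A^{(tu)}$, $P^{(t)}$ and $Q^{(u)}$, for $t, u = 1, \ldots, k$. The $(r, s)$ elements of such
matrices are

$$A^{(tu)} = \left\{ \frac{\kappa_{rstu}}{4} - \kappa_{rst}^{(u)} + \kappa_{rt}^{(su)} \right\}, \quad P^{(t)} = \{\kappa_{rst}\}, \quad Q^{(u)} = \{\kappa_{su}^{(r)}\}, \qquad (5)$$

for $t, u = 1, \ldots, k$. Using matrix notation, we can write

$$\sum_{\theta} \lambda_{rstu} = \mathrm{tr}(K^{-1}L),$$

$$\sum_{\theta} \kappa^{rs}\kappa^{tu}\kappa^{vw} \left\{ \frac{1}{6}\kappa_{rtv}\kappa_{suw} - \kappa_{rtv}\kappa_{sw}^{(u)} + \kappa_{rt}^{(v)}\kappa_{sw}^{(u)} \right\} = -\frac{1}{6}\mathrm{tr}(K^{-1}M_1) + \mathrm{tr}(K^{-1}M_2) - \mathrm{tr}(K^{-1}M_3), \qquad (6)$$

$$\sum_{\theta} \kappa^{rs}\kappa^{tu}\kappa^{vw} \left\{ \frac{1}{4}\kappa_{rtu}\kappa_{svw} - \kappa_{rtu}\kappa_{sw}^{(v)} + \kappa_{rt}^{(u)}\kappa_{sw}^{(v)} \right\} = -\frac{1}{4}\mathrm{tr}(K^{-1}N_1) + \mathrm{tr}(K^{-1}N_2) - \mathrm{tr}(K^{-1}N_3), \qquad (7)$$

where the $(r, s)$ elements of the $L$, $M_1$, $M_2$, $M_3$, $N_1$, $N_2$ and $N_3$ matrices are
given, respectively, by

$$\{\mathrm{tr}(K^{-1}A^{(rs)})\},$$
$$\{\mathrm{tr}(K^{-1}P^{(r)}K^{-1}P^{(s)})\},$$
$$\{\mathrm{tr}(K^{-1}P^{(r)}K^{-1}Q^{(s)\top})\},$$
$$\{\mathrm{tr}(K^{-1}Q^{(r)}K^{-1}Q^{(s)})\},$$
$$\{\mathrm{tr}(P^{(r)}K^{-1})\,\mathrm{tr}(P^{(s)}K^{-1})\},$$
$$\{\mathrm{tr}(P^{(r)}K^{-1})\,\mathrm{tr}(Q^{(s)}K^{-1})\},$$
$$\{\mathrm{tr}(Q^{(r)}K^{-1})\,\mathrm{tr}(Q^{(s)}K^{-1})\}.$$

Using (4)–(7) we can write

$$\epsilon_k = \mathrm{tr}\left[ K^{-1}(L - M - N) \right],$$

where $M = -\frac{1}{6}M_1 + M_2 - M_3$ and $N = -\frac{1}{4}N_1 + N_2 - N_3$.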


The term $\epsilon_k$ can be easily computed using a matrix programming language,
like Ox (Doornik, 2007) and R (R Development Core Team, 2009). It
only requires the computation of $(k + 1)^2$ matrices of order $k$, namely: $K^{-1}$, $k$
matrices $P^{(t)}$, $k$ matrices $Q^{(u)}$ and $k^2$ matrices $A^{(tu)}$. The remaining matrices
can be obtained from them using simple matrix operations. Thus, to obtain the
Bartlett correction factor $c$ we need to compute $(k + 1)^2$ matrices of order $k$ and
$(k - q + 1)^2$ matrices of order $k - q$. In order to obtain the matrices $A^{(tu)}$, $P^{(t)}$ and
$Q^{(u)}$ we need cumulants of log-likelihood derivatives up to fourth order. We
have derived these cumulants for the beta regression model and present them
in Appendix A.

The usual Bartlett corrected likelihood ratio statistic is given by $LR/c$.
There are, however, two other equivalent specifications that deliver the same
order of accuracy. The three Bartlett corrected test statistics are

$$LR_{b1} = \frac{LR}{c},$$

$$LR_{b2} = LR \exp\left\{ -\frac{\epsilon_k - \epsilon_{k-q}}{q} \right\},$$

$$LR_{b3} = LR \left\{ 1 - \frac{\epsilon_k - \epsilon_{k-q}}{q} \right\}.$$

The corrected statistics $LR_{b1}$, $LR_{b2}$ and $LR_{b3}$ are equivalent to order $O(n^{-1})$
(Lemonte et al., 2010), and $LR_{b2}$ has the advantage of only taking positive
values.
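Once $\epsilon_k$ and $\epsilon_{k-q}$ are available, the three corrected statistics are one-line transformations of $LR$. The sketch below assumes those two quantities have already been computed (for instance from the matrix formula); the function name and the numerical inputs are hypothetical.

```python
import math


def bartlett_corrected(lr, eps_k, eps_k_minus_q, q):
    """Return the three Bartlett corrected statistics as a dictionary.

    lr:            uncorrected likelihood ratio statistic
    eps_k:         epsilon_k, summed over all k parameters
    eps_k_minus_q: epsilon_{k-q}, summed over the nuisance parameters only
    q:             number of restrictions under the null hypothesis
    """
    d = (eps_k - eps_k_minus_q) / q
    c = 1.0 + d  # Bartlett correction factor
    return {
        "b1": lr / c,
        "b2": lr * math.exp(-d),
        "b3": lr * (1.0 - d),
    }


# Hypothetical inputs.
stats = bartlett_corrected(lr=5.0, eps_k=0.9, eps_k_minus_q=0.6, q=1)

# With a large correction term, b3 turns negative while b2 stays positive.
extreme = bartlett_corrected(lr=5.0, eps_k=1.8, eps_k_minus_q=0.3, q=1)
```

Because $1 - d \leq e^{-d} \leq 1/(1+d)$ for $d \geq 0$, the three versions are ordered as $LR_{b3} \leq LR_{b2} \leq LR_{b1}$ whenever the correction term is nonnegative, and they differ only in higher order terms when $d$ is small.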


3.2. Bootstrap Bartlett correction 


Rocke (1989) introduced a numeric alternative to the analytic Bartlett correction
in which the correction factor is computed using Efron’s bootstrap
(Efron, 1979). The bootstrap Bartlett correction can be described as follows.
Bootstrap resamples are used to estimate the expected value of the likelihood
ratio statistic. Here, $B$ bootstrap resamples $(y^{*1}, \ldots, y^{*B})$ are generated using the
parametric bootstrap and imposing $\mathcal{H}_0$. Data generation is performed from
the postulated model after replacing the unknown parameter vector by its restricted
estimate, i.e., by the estimate obtained under the null hypothesis. For
each pseudo sample $y^{*b}$, $b = 1, 2, \ldots, B$, the $LR$ statistic is computed as

$$LR^{*b} = 2\{\ell(\hat\theta^{*b}; y^{*b}) - \ell(\tilde\theta^{*b}; y^{*b})\},$$

where $\hat\theta^{*b}$ and $\tilde\theta^{*b}$ are the maximum likelihood estimators of $\theta$ obtained from
the maximization of $\ell(\theta; y^{*b})$ under $\mathcal{H}_1$ and $\mathcal{H}_0$, respectively. The bootstrap
Bartlett corrected likelihood ratio statistic is then computed as

$$LR_{boot} = \frac{LR \, q}{\overline{LR}^*},$$

where $\overline{LR}^* = B^{-1} \sum_{b=1}^{B} LR^{*b}$.

It is noteworthy that the bootstrap Bartlett correction is computationally
more efficient than the usual approach of using the bootstrap method to obtain a
critical value (or a p-value) since it requires a smaller number of resamples. The
usual bootstrap approach typically requires 1,000 bootstrap resamples, since it
involves estimating tail quantities (Efron, 1986, 1987); on the other hand, the
bootstrap Bartlett correction is expected to work well when based on only 200
artificial samples. Notice that in the latter we use data resampling to estimate
the mean of a distribution, and not an upper quantile. According to Rocke
(1989), the bootstrap Bartlett correction that uses $B = 100$ typically yields
inferences that are as accurate as those obtained using the usual bootstrapping
scheme with $B = 700$.
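The resampling scheme can be illustrated end to end on a toy model. The sketch below is not a beta regression: as a stand-in, it tests the rate of an exponential sample, for which the likelihood ratio statistic has a closed form and the null hypothesis leaves no nuisance parameter to estimate. The function names, the seed and the values of `rate0`, `n` and `B` are all illustrative assumptions.

```python
import math
import random


def lr_exponential(sample, rate0):
    """LR statistic for H0: rate = rate0 in an i.i.d. exponential sample.

    The unrestricted MLE of the rate is 1 / mean(sample), which gives
    LR = 2 n {r - 1 - log(r)} with r = rate0 * mean(sample).
    """
    n = len(sample)
    r = rate0 * sum(sample) / n
    return 2.0 * n * (r - 1.0 - math.log(r))


def bootstrap_bartlett(lr_observed, lr_bootstrap, q):
    """Bootstrap Bartlett corrected statistic: LR * q / mean(LR*)."""
    mean_lr = sum(lr_bootstrap) / len(lr_bootstrap)
    return lr_observed * q / mean_lr


rng = random.Random(12345)       # fixed seed for reproducibility
rate0, n, B, q = 2.0, 10, 500, 1

data = [rng.expovariate(rate0) for _ in range(n)]
lr_obs = lr_exponential(data, rate0)

# Parametric bootstrap under H0: pseudo samples are drawn from the null model.
lr_star = [
    lr_exponential([rng.expovariate(rate0) for _ in range(n)], rate0)
    for _ in range(B)
]
mean_lr_star = sum(lr_star) / B
lr_boot = bootstrap_bartlett(lr_obs, lr_star, q)
```

In a beta regression the same loop applies, except that each pseudo sample is generated from the model fitted under the null hypothesis and both the restricted and unrestricted fits must be recomputed for every resample. Only the mean of the bootstrap statistics is needed, which is why a moderate `B` tends to suffice.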


3.3. Skovgaard’s adjustment 


In a different approach, Skovgaard (2001) generalized the results in Skovgaard
(1996) and presented a much simpler way to improve likelihood ratio testing
inference. His adjustment was later computed for various models; see, e.g.,
Ferrari and Cysneiros (2008), Ferrari and Pinheiro (2011), Melo et al. (2009)
and Pereira (2010). The numerical evidence presented by these authors indicates
that hypothesis testing inference based on Skovgaard’s modified likelihood ratio
statistic is typically more accurate than that based on the uncorrected statistic.

In order to present Skovgaard’s adjustment to the likelihood ratio test
statistic, which was derived by Ferrari and Pinheiro (2011) for beta regressions,
we shall now introduce some additional notation. Recall that $\theta = (\theta_1^\top, \theta_2^\top)^\top$,
where $\theta_1$ and $\theta_2$ are the interest and nuisance parameters, respectively. Let $J$
denote the observed information matrix and let $J_{11}$ be the observed information
matrix corresponding to $\theta_1$. Additionally, $\hat J = J(\hat\theta)$, $\tilde J = J(\tilde\theta)$, $\hat K = K(\hat\theta)$,
$\tilde K = K(\tilde\theta)$ and $\tilde U = U(\tilde\theta)$.

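The Skovgaard modified likelihood ratio test statistic is given by

$$LR_{sk1} = LR - 2\log\xi,$$

where

$$\xi = \frac{\{|\tilde K|\,|\hat K|\,|\tilde J_{11}|\}^{1/2}}{|\bar\Upsilon|\,|\{\tilde K \bar\Upsilon^{-1} \hat J \hat K^{-1} \bar\Upsilon\}_{11}|^{1/2}} \cdot \frac{\{\tilde U^\top \bar\Upsilon^{-1} \hat K \hat J^{-1} \bar\Upsilon \tilde K^{-1} \tilde U\}^{q/2}}{LR^{q/2 - 1}\, \tilde U^\top \bar\Upsilon^{-1} \bar\upsilon}.$$

Here, $\bar\Upsilon$ and $\bar\upsilon$ are obtained by replacing $\theta$ by $\hat\theta$ and $\theta_2$ by $\tilde\theta$ in $\Upsilon = E_\theta[U(\theta)U^\top(\theta_2)]$
and $\upsilon = E_\theta[U(\theta)(\ell(\theta) - \ell(\theta_2))]$ after expected values are computed.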

An asymptotically equivalent version of the above test statistic is

$$LR_{sk2} = LR\left( 1 - \frac{1}{LR}\log\xi \right)^2.$$

Under $\mathcal{H}_0$, $LR_{sk1}$ and $LR_{sk2}$ are approximately distributed as $\chi^2_q$ with a
high degree of accuracy (Skovgaard, 2001; Ferrari and Pinheiro, 2011). For more
details and matrix formulas for $\bar\Upsilon$ and $\bar\upsilon$ in beta regressions, see Ferrari and
Pinheiro (2011). In Ferrari and Pinheiro (2011) the Skovgaard adjustment is
derived for a general class of beta regressions that allows for nonlinearities and
varying dispersion.


4. Numerical evidence 


This section presents Monte Carlo simulation results on the small sample
performance of the likelihood ratio test ($LR$) in beta regression and also of six
tests that are based on corrected statistics, namely: the three Bartlett corrected
statistics ($LR_{b1}$, $LR_{b2}$ and $LR_{b3}$), the bootstrap Bartlett corrected statistic
($LR_{boot}$) and the two modified statistics obtained using Skovgaard’s adjustment
($LR_{sk1}$ and $LR_{sk2}$). The number of Monte Carlo replications is 10,000. For
each Monte Carlo replication we performed 500 bootstrap replications. All
simulations were carried out using the R programming language (R Development
Core Team, 2009).


We consider the following beta regression model:

$$\mathrm{logit}(\mu_i) = \beta_1 + \beta_2 x_{2i} + \beta_3 x_{3i} + \beta_4 x_{4i} + \beta_5 x_{5i}.$$

The covariate values are chosen as random draws from the $\mathcal{U}(-0.5, 0.5)$ distribution
and are kept fixed during the simulations. We use four different values
for the precision parameter $\phi$, namely: 100, 30, 10 and 5. Restrictions on $\beta$
are tested using samples of 15, 20, 30 and 40 observations and at three nominal
levels: $\alpha$ = 10%, 5% and 1%. The null hypotheses are $\mathcal{H}_0: \beta_2 = 0$ ($q = 1$),
$\mathcal{H}_0: \beta_2 = \beta_3 = 0$ ($q = 2$) and $\mathcal{H}_0: \beta_2 = \beta_3 = \beta_4 = 0$ ($q = 3$), to be tested
against two-sided alternative hypotheses. When $q = 1$, we set $\beta_1 = 1$, $\beta_2 = 0$,
$\beta_3 = 1$, $\beta_4 = 5$ and $\beta_5 = -4$. When $q = 2$, $\beta_1 = 1$, $\beta_2 = \beta_3 = 0$, $\beta_4 = 5$ and
$\beta_5 = -4$. Finally, when $q = 3$, the parameter values used for data generation
are $\beta_1 = 1$, $\beta_2 = \beta_3 = \beta_4 = 0$ and $\beta_5 = -4$.
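The data generating process of this design can be sketched in a few lines. This is an illustration rather than the authors' simulation code (which was written in R): it assumes the covariates are uniform on (-0.5, 0.5), uses the logit link, and draws each response from a beta distribution with shape parameters $\mu_i\phi$ and $(1-\mu_i)\phi$. The function names and the seed are hypothetical.

```python
import math
import random


def inv_logit(eta):
    """Numerically stable inverse of the logit link."""
    if eta >= 0.0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


def draw_covariates(n, p, rng):
    """n rows: an intercept followed by p - 1 draws from U(-0.5, 0.5)."""
    return [[1.0] + [rng.uniform(-0.5, 0.5) for _ in range(p - 1)]
            for _ in range(n)]


def simulate_response(X, beta, phi, rng):
    """Draw y_i from a beta law with mean mu_i and precision phi."""
    y = []
    for x_row in X:
        mu = inv_logit(sum(x * b for x, b in zip(x_row, beta)))
        y.append(rng.betavariate(mu * phi, (1.0 - mu) * phi))
    return y


rng = random.Random(2024)            # illustrative seed
X = draw_covariates(15, 5, rng)      # covariates are drawn once and kept fixed
beta_q1 = [1.0, 0.0, 1.0, 5.0, -4.0] # parameter values for the q = 1 scenario
y = simulate_response(X, beta_q1, 100.0, rng)
```

In a Monte Carlo experiment, `simulate_response` would be called once per replication with the same `X`, and the tests would be applied to each simulated `y`.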

Tables 1 ($q = 1$), 2 ($q = 2$) and 3 ($q = 3$) present the null rejection rates of
the different tests. The figures in these tables clearly show that the likelihood
ratio test is considerably oversized (liberal); its null rejection rate can be eight
times larger than the nominal level, as in Table 2 for $\phi = 5$, $\alpha$ = 1% and
$n = 15$. In general, larger sample sizes and/or larger values of $\phi$ lead to smaller
size distortions.

The simulation results for $q = 1$ presented in Table 1 indicate that the
corrected tests display good small sample behavior. The Bartlett corrected test
$LR_{b3}$ is the best performer, being followed by the Skovgaard adjusted test $LR_{sk1}$
and by the bootstrap Bartlett corrected test, $LR_{boot}$. The latter outperforms
the competition when $\phi = 30$. For instance, when $\phi = 30$ and $\alpha$ = 10%, the null
rejection rates of $LR_{b3}$ for the four sample sizes are 10.2%, 10.3%, 10.6% and
10.0% and the corresponding rates of $LR_{sk1}$ are 10.2%, 10.3%, 10.8% and
10.2%. The good performance of the $LR_{b3}$ test can be observed in all scenarios.
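When reading the rejection rates, it helps to keep the Monte Carlo error in mind. The sketch below shows how an empirical null rejection rate is computed from simulated statistics and how large its standard error is with 10,000 replications; the function names and the short list of statistics are hypothetical, and the critical values are standard upper quantiles of the chi-squared distribution with one degree of freedom.

```python
import math

# Upper quantiles of the chi-squared distribution with 1 degree of freedom.
CHI2_1_CRITICAL = {0.10: 2.705543, 0.05: 3.841459, 0.01: 6.634897}


def rejection_rate(statistics, critical_value):
    """Fraction of simulated statistics exceeding the critical value."""
    rejections = sum(1 for s in statistics if s > critical_value)
    return rejections / len(statistics)


def monte_carlo_se(rate, replications):
    """Binomial standard error of an estimated rejection rate."""
    return math.sqrt(rate * (1.0 - rate) / replications)


# Hypothetical statistics from four replications, tested at the 5% level.
toy_rate = rejection_rate([1.2, 4.5, 0.3, 7.1], CHI2_1_CRITICAL[0.05])

# Standard error when the true rate is 5% and 10,000 replications are used.
se_5_percent = monte_carlo_se(0.05, 10000)
```

With 10,000 replications the standard error at the 5% level is roughly 0.2 percentage points, so differences of a few tenths of a point between corrected tests are within simulation noise, whereas the distortions of the uncorrected test are far larger than that.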

Table 1: Null rejection rates (%) for the test of $\mathcal{H}_0: \beta_2 = 0$ ($q = 1$).

                     α = 10%                   α = 5%                    α = 1%
φ    Stat.     n=15  n=20  n=30  n=40    n=15  n=20  n=30  n=40    n=15  n=20  n=30  n=40

100  LR        18.9  16.5  13.7  12.8    11.7   9.5   7.5   7.2     4.0   3.0   2.1   1.7
     LRb1      12.4  11.6  10.6  10.5     6.9   5.9   5.4   5.7     1.6   1.4   1.0   1.0
     LRb2      11.5  11.0  10.3  10.3     6.2   5.5   5.3   5.6     1.4   1.2   1.0   1.0
     LRb3      10.0  10.0   9.9  10.1     5.0   4.9   5.1   5.4     0.9   1.0   0.9   1.0
     LRsk1     10.1  10.0   9.9  10.3     4.9   4.9   5.2   5.4     1.0   1.0   1.0   1.0
     LRsk2     11.8  11.4  11.0  11.2     6.3   5.9   6.1   6.3     1.8   1.6   1.6   1.5
     LRboot    10.2  10.1   9.9  10.4     5.1   5.0   5.1   5.5     0.9   1.0   1.0   1.0

30   LR        19.5  16.8  14.8  13.0    12.1   9.7   8.0   7.0     4.2   2.7   2.3   1.6
     LRb1      12.7  11.8  11.3  10.5     6.8   6.1   6.1   5.2     1.7   1.3   1.4   1.1
     LRb2      11.7  11.2  11.0  10.2     6.1   5.6   6.0   5.1     1.4   1.1   1.2   1.1
     LRb3      10.2  10.3  10.6  10.0     5.1   5.0   5.7   4.9     1.1   1.0   1.1   1.0
     LRsk1     10.2  10.3  10.8  10.2     5.2   4.9   5.7   5.0     1.2   1.0   1.2   1.0
     LRsk2     13.2  11.7  12.7  11.7     7.6   6.2   7.2   6.2     2.7   1.7   2.1   2.0
     LRboot    10.2  10.2  10.6  10.3     4.9   5.0   5.6   4.9     1.1   1.0   1.2   1.1

10   LR        22.0  21.4  17.9  13.7    14.4  13.8  11.0   8.1     5.5   5.1   3.6   2.2
     LRb1      15.2  15.9  14.2  11.4     8.6   9.1   8.2   6.3     2.4   2.5   2.2   1.4
     LRb2      13.8  15.1  13.9  11.2     7.7   8.5   7.9   6.2     1.9   2.2   2.0   1.3
     LRb3      12.0  14.2  13.5  11.0     6.3   7.8   7.6   6.0     1.5   1.9   1.9   1.2
     LRsk1     12.1  14.6  13.9  11.0     6.4   8.0   7.8   5.9     1.5   2.0   2.0   1.2
     LRsk2     14.9  17.3  16.3  13.0     8.7  10.2   9.9   7.6     2.8   3.5   3.6   2.6
     LRboot    12.2  14.6  14.4  12.7     6.6   8.2   8.5   7.2     1.5   2.1   2.2   1.9

5    LR        19.1  16.2  15.4  12.7    12.2   9.6   8.7   6.8     4.3   3.0   2.5   1.8
     LRb1      12.9  11.5  12.0  10.8     7.0   6.2   6.3   5.3     1.8   1.3   1.3   1.2
     LRb2      12.1  11.0  11.6  10.7     6.3   5.8   6.0   5.2     1.4   1.2   1.3   1.1
     LRb3      10.6  10.2  11.3  10.5     5.3   5.2   5.8   5.1     0.9   1.0   1.2   1.1
     LRsk1     11.7  10.9  11.5  10.7     6.2   5.6   6.1   5.3     1.3   1.1   1.2   1.2
     LRsk2     15.2  14.1  14.1  13.1     9.1   8.1   8.2   7.3     3.4   2.8   2.8   2.8
     LRboot    13.9  10.4  11.7  12.3     7.8   5.5   6.2   6.6     2.5   1.1   1.4   1.8

The results for the cases where we impose more than one restriction, namely
$q = 2$ and $q = 3$, are presented in Tables 2 and 3 and are similar to those
obtained for $q = 1$. The modified tests once again displayed small size distortions.
For instance, for $q = 2$, $\phi = 30$ and $\alpha$ = 5% the type I error frequency
of the uncorrected likelihood ratio test equals 14.4% for $n = 15$ whereas for the
corrected tests $LR_{b3}$ and $LR_{boot}$ it equals 5.6%. The corresponding rejection
rate of $LR_{sk1}$ was 6.4%. For $q = 3$, $\phi = 30$, $\alpha$ = 5% and $n = 15$ the null
rejection rates are 14.6% ($LR$), 5.0% ($LR_{b3}$) and 5.0% ($LR_{boot}$). For $\phi = 30$
and $\alpha$ = 1%, the null rejection rates of the $LR_{b3}$, $LR_{sk1}$ and $LR_{boot}$ tests are
very close to 1.0% whereas, for the four sample sizes considered, the likelihood
ratio test null rejection rates were 4.8%, 3.3%, 2.4% and 1.8%.

The numerical results presented in Tables 1, 2 and 3 show that the corrected
tests outperform the uncorrected test in small samples. The best performing
corrected tests are the Bartlett corrected test $LR_{b3}$, the bootstrap Bartlett
corrected test $LR_{boot}$ and the Skovgaard test, $LR_{sk1}$. The null rejection rates of
these tests are closer to the nominal levels than those of the uncorrected test and
of the other corrected tests. In particular, the bootstrap Bartlett
correction works very well when $\phi = 30$ and $\phi = 100$.

Table 2: Null rejection rates (%) for the test of $\mathcal{H}_0: \beta_2 = \beta_3 = 0$ ($q = 2$).

                     α = 10%                   α = 5%                    α = 1%
φ    Stat.     n=15  n=20  n=30  n=40    n=15  n=20  n=30  n=40    n=15  n=20  n=30  n=40

100  LR        22.0  17.1  14.1  13.7    14.1  10.0   7.8   7.5     4.9   3.1   2.2   1.7
     LRb1      13.3  10.9  10.1  10.5     7.3   5.9   5.1   5.6     1.6   1.3   1.2   1.1
     LRb2      12.1  10.3   9.7  10.3     6.4   5.5   5.0   5.5     1.3   1.2   1.2   1.0
     LRb3      10.3   9.5   9.4  10.1     5.4   4.8   4.7   5.4     1.0   1.0   1.1   1.0
     LRsk1     10.4   9.6   9.5  10.1     5.3   4.9   4.7   5.4     1.0   1.0   1.1   1.0
     LRsk2     11.5  10.1   9.7  10.3     6.1   5.2   4.9   5.5     1.2   1.1   1.2   1.0
     LRboot    10.5   9.6   9.5  10.2     5.4   5.0   4.8   5.4     1.0   1.0   1.1   1.0

30   LR        23.0  17.8  14.6  13.8    14.4  10.6   7.8   7.6     5.4   3.3   1.9   1.9
     LRb1      13.7  11.7  10.2  10.7     7.8   6.0   4.8   5.4     2.0   1.4   1.0   1.0
     LRb2      12.4  10.9   9.8  10.5     6.9   5.6   4.7   5.3     1.5   1.2   1.0   1.0
     LRb3      10.7  10.1   9.4  10.2     5.6   5.0   4.5   5.2     1.1   1.0   1.0   0.9
     LRsk1     11.2  10.3   9.6  10.3     6.4   5.1   4.5   5.3     1.8   1.0   1.0   0.9
     LRsk2     12.2  10.9   9.9  10.5     7.1   5.4   4.7   5.4     1.9   1.2   1.1   1.0
     LRboot    10.5  10.1   9.5  10.4     5.6   5.0   4.6   5.2     1.1   1.0   1.0   1.0

10   LR        26.0  19.1  16.0  15.2    17.4  11.8   9.1   8.4     7.0   3.7   2.7   2.4
     LRb1      16.5  12.7  11.7  12.0     9.8   6.7   6.3   6.2     2.8   1.6   1.4   1.4
     LRb2      15.1  12.0  11.3  11.8     8.9   6.3   6.0   6.0     2.3   1.4   1.3   1.4
     LRb3      13.2  11.0  10.9  11.6     7.4   5.7   5.6   5.9     1.8   1.3   1.2   1.3
     LRsk1     13.4  11.5  11.0  11.7     7.5   5.9   5.7   6.0     1.8   1.3   1.3   1.3
     LRsk2     14.5  12.2  11.4  12.1     8.4   6.4   6.0   6.3     2.2   1.5   1.4   1.5
     LRboot    13.6  11.0  11.1  12.8     7.8   5.6   5.8   6.8     2.0   1.2   1.3   1.7

5    LR        27.8  19.7  15.3  13.1    19.3  12.0   8.5   7.0     8.0   4.2   2.4   1.9
     LRb1      18.6  13.1  11.2  10.1    11.0   7.1   5.8   5.5     3.6   1.8   1.2   1.2
     LRb2      17.2  12.4  10.8  10.0    10.0   6.5   5.6   5.4     3.1   1.7   1.1   1.1
     LRb3      14.9  11.5  10.5   9.8     8.4   6.0   5.4   5.2     2.3   1.5   1.0   1.0
     LRsk1     14.4  12.0  11.2  10.0     7.9   6.2   5.6   5.2     2.2   1.6   1.1   1.2
     LRsk2     16.0  12.8  11.5  10.4     9.1   6.7   5.9   5.6     2.7   1.8   1.2   1.3
     LRboot    15.4  12.1  11.0  14.8     8.9   6.4   5.8   8.7     2.6   1.7   1.2   2.5

Table 4 presents moments and quantiles of the different test statistics alongside
their asymptotic counterparts for $q = 2$, $\phi = 30$ and $n = 20$. It
is noteworthy that the approximation to the likelihood ratio null distribution
is quite poor. For example, the limiting null distribution variance equals 4
whereas the variance of $LR$ exceeds 7. On the other hand, the same approximation
works quite well for the (analytically and numerically) Bartlett corrected
statistics. The $LR_{b3}$ statistic stands out, being followed by $LR_{boot}$. For instance,
the mean and variance of $LR_{b3}$ are, respectively, 1.9993 and 4.0729, which are
very close to two and four, the $\chi^2_2$ mean and variance. The worst performing
corrected statistic is $LR_{sk2}$, especially when we consider its skewness and kurtosis.
We also note that the limiting null approximation provided to the exact null
distribution of $LR_{sk1}$ is not as accurate as for the Bartlett corrected statistics
and $LR_{boot}$. This fact is evidenced by the measures of variance (4.2331),
skewness (2.1816), kurtosis (11.5872) and by the 90th quantile (4.6612), which
are considerably different from the respective chi-squared reference values.

Figure contains QQ plots (exact empirical quantiles versus asymptotic 
quantiles) for different sample sizes when ^ = 10 and q = 1. Figure shows 
estimated null densities of some statistics for (j) = 5 and q = 3. These densities 
were estimated using the kernel method with Gaussian kernel function]^ In 


^For details on nonparametric density estimation, seelSilvermanl l|l986| and I Venables andl 

-- - 
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ire 3: Xq density (solid line) and estimated null densities of four test statistics 

q = 3. 









Table 3: Null rejection rates (%) for the test of $H_0: \beta_2 = \beta_3 = \beta_4 = 0$ ($q = 3$).

                       α = 10%                  α = 5%                  α = 1%
φ     Stat      n=15  n=20  n=30  n=40  n=15  n=20  n=30  n=40  n=15  n=20  n=30  n=40

100   LR        22.3  18.5  15.5  14.0  14.4  11.1   8.7   7.9   5.1   3.0   2.4   2.0
      LRb1      13.2  11.7  11.2  11.0   7.4   5.8   5.5   5.4   1.8   1.3   1.2   1.1
      LRb2      12.0  11.0  11.0  11.0   6.7   5.3   5.4   5.4   1.5   1.2   1.2   1.1
      LRb3      10.4  10.2  10.5  10.7   5.4   4.8   5.1   5.3   1.1   1.1   1.1   1.1
      LRsk1     10.3  10.2  10.5  10.7   5.4   4.8   5.2   5.2   1.1   1.0   1.1   1.0
      LRsk2     11.2  10.7  10.8  10.9   6.2   5.1   5.3   5.4   1.3   1.2   1.1   1.1
      LRboot    10.2  10.1  10.6  10.7   5.4   4.7   5.2   5.3   1.1   1.1   1.1   1.1

30    LR        23.0  17.4  14.6  13.6  14.6  10.7   8.2   7.4   4.8   3.8   2.4   1.8
      LRb1      13.1  11.2  10.3  10.4   7.0   5.9   5.5   5.2   1.8   1.3   1.1   1.2
      LRb2      11.9  10.6  10.0  10.2   6.1   5.3   5.3   5.0   1.4   1.2   1.1   1.1
      LRb3      10.3   9.8   9.6  10.0   5.0   4.8   5.1   5.0   1.0   1.0   1.0   1.1
      LRsk1     10.2   9.9   9.7  10.1   5.1   4.8   5.1   5.1   1.1   1.0   1.0   1.1
      LRsk2     10.2  10.3   9.9  10.3   5.7   5.1   5.2   5.2   1.3   1.1   1.0   1.2
      LRboot    10.2   9.8   9.5   9.9   5.0   4.8   5.1   5.0   1.0   1.0   1.0   1.2

10    LR        22.1  18.6  15.3  13.6  13.7  11.2   8.7   7.5   4.6   3.2   2.2   1.8
      LRb1      12.2  11.7  10.8  10.3   6.8   6.2   5.5   5.3   1.5   1.2   1.0   1.0
      LRb2      11.2  11.2  10.5  10.1   6.0   5.7   5.3   5.2   1.3   1.1   1.0   1.0
      LRb3       9.8  10.2  10.1   9.9   5.0   5.1   5.1   5.0   0.9   0.9   0.9   1.0
      LRsk1     10.3  10.6  10.4  10.3   5.1   5.3   5.2   5.1   1.0   0.9   1.0   1.0
      LRsk2     11.2  11.1  10.7  10.4   5.8   5.7   5.4   5.2   1.2   1.0   1.0   1.0
      LRboot     9.6  10.1  10.2  10.0   4.7   5.1   5.0   5.0   0.9   0.9   1.0   1.0

5     LR        21.5  18.4  15.0  12.9  13.6  11.0   8.3   7.2   4.4   3.5   2.3   1.5
      LRb1      12.5  11.6  10.6   9.9   6.5   6.0   5.4   5.1   1.5   1.4   1.2   0.8
      LRb2      11.3  11.0  10.3   9.8   5.9   5.5   5.2   5.1   1.3   1.3   1.1   0.8
      LRb3       9.7  10.2  10.0   9.7   4.8   5.0   5.0   4.9   1.0   1.0   1.0   0.8
      LRsk1     10.1  10.8  10.5  10.0   5.1   5.4   5.4   5.1   1.0   1.1   1.2   0.8
      LRsk2     11.2  11.3  10.7  10.3   5.8   5.7   5.5   5.3   1.2   1.3   1.2   0.9
      LRboot     9.6  10.2  10.0   9.6   4.8   5.0   5.0   5.1   1.1   1.1   1.0   0.8

both figures we consider the likelihood ratio test statistic ($LR$), the best performing Bartlett corrected statistic ($LR_{b3}$), the bootstrap Bartlett corrected statistic ($LR_{boot}$) and the best performing statistic modified using Skovgaard's approach ($LR_{sk1}$). The QQ plots in Figure 2 show that the null distributions of the corrected statistics are much closer to the reference distribution than that of $LR$. The best agreement between exact and limiting null distributions takes place for $LR_{b3}$. The same conclusion can be drawn from the estimated null densities presented in Figure 3.
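
The density comparison described above can be illustrated with a minimal sketch of Gaussian kernel density estimation. It is not the code used in the paper: the simulated statistic is replaced by exact $\chi^2_3$ draws, the bandwidth uses the normal-reference rule of thumb (an assumption, since the paper does not state its bandwidth choice), and all function names are hypothetical.

```python
import math
import random

def gaussian_kde(sample, bandwidth=None):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    n = len(sample)
    if bandwidth is None:
        mean = sum(sample) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in sample) / (n - 1))
        bandwidth = 1.06 * sd * n ** (-0.2)  # normal-reference rule (assumption)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in sample)

    return density

def chi2_pdf_3df(x):
    """Density of the chi-squared distribution with 3 degrees of freedom."""
    return math.sqrt(x / (2.0 * math.pi)) * math.exp(-0.5 * x) if x > 0 else 0.0

rng = random.Random(2024)
# Stand-in for simulated null statistics: sums of three squared standard normals
draws = [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(3)) for _ in range(5000)]
kde = gaussian_kde(draws)
for x in (1.0, 3.0, 6.0):
    print(x, round(kde(x), 4), round(chi2_pdf_3df(x), 4))
```

With simulated values of a test statistic in place of `draws`, the gap between the two printed columns plays the role of the visual gap between the estimated and reference densities.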

We have also used Monte Carlo simulation to estimate the tests' nonnull rejection rates, i.e., their powers. Table 5 presents such rates when data generation was carried out using $\beta_2 = \delta$ ($q = 1$), $\beta_2 = \beta_3 = \delta$ ($q = 2$) and $\beta_2 = \beta_3 = \beta_4 = \delta$ ($q = 3$), for different values of $\delta$. We only considered the corrected tests $LR_{b3}$, $LR_{boot}$ and $LR_{sk1}$; the likelihood ratio test is not included in the power comparison because it is considerably oversized. As expected, the tests become more powerful as $\delta$ moves away from zero. We also note that the test based on $LR_{sk1}$ is slightly more powerful than the other two tests, especially when $\delta > 0$. When $\delta < 0$, $LR_{b3}$ is the most powerful test in some scenarios, e.g., when $\phi = 5$ and $q = 3$, as well as when $\phi = 10$ and $q = 1$. When $\delta = -0.5$, $q = 1$, $\phi = 10$
Table 4: Estimated moments and quantiles of the different test statistics; $q = 2$, $\phi = 30$ and $n = 20$.

Statistic           Mean    Variance    Skewness    Kurtosis  90th perc.  95th perc.  99th perc.

$\chi^2_2$        2.0000      4.0000      2.0000      9.0000      4.6052      5.9915      9.2103
LR                2.6741      7.2829      2.0784      9.7003      6.1775      8.0134     12.2788
LRb1              2.1353      4.6449      2.0799      9.7146      4.9319      6.4065      9.7992
LRb2              2.0777      4.3982      2.0804      9.7182      4.7979      6.2333      9.5338
LRb3              1.9993      4.0729      2.0810      9.7243      4.6147      5.9960      9.1731
LRsk1             2.0127      4.2331      2.1816     11.5872      4.6612      6.0227      9.2845
LRsk2             2.0906      4.6776      3.1049     27.7738      4.7836      6.2003      9.5926
LRboot            2.0024      4.1168      2.1086      9.9347      4.6103      5.9856      9.2791


Table 5: Nonnull rejection rates (%); $n = 20$ and $\alpha = 5\%$.

                     δ
q      φ     Stat       -2.5   -2.0   -1.5   -1.0   -0.5    0.5    1.0    1.5    2.0    2.5

q = 1  100   LRb3        100    100    100   99.5   61.5   61.9   99.5    100    100    100
             LRsk1       100    100    100   99.4   60.8   62.1   99.5    100    100    100
             LRboot      100    100    100   99.5   61.6   61.8   99.5    100    100    100
       30    LRb3        100   99.8   96.5   71.1   24.8   25.2   71.8   96.3   99.8    100
             LRsk1       100   99.8   96.4   70.7   24.7   25.4   72.3   96.4   99.8    100
             LRboot      100   99.8   96.4   71.1   24.8   25.1   71.9   96.1   99.8    100
       10    LRb3       96.2   85.3   62.0   33.3   11.6   12.5   34.7   63.5   86.0   96.2
             LRsk1      96.0   85.2   61.7   33.3   11.5   13.2   36.0   64.8   86.8   96.5
             LRboot     96.3   85.6   62.4   33.7   12.0   12.7   34.9   63.6   85.8   96.2
       5     LRb3       79.6   61.2   39.6   20.4    9.0    9.6   21.5   41.4   62.1   81.2
             LRsk1      79.8   61.5   39.8   20.7    9.2   10.6   23.2   43.6   64.4   82.8
             LRboot     80.4   62.1   40.6   21.4    9.6    9.8   21.8   41.4   62.1   80.9

q = 2  100   LRb3        100    100    100    100   79.6   80.4    100    100    100    100
             LRsk1       100    100    100    100   79.2   80.2    100    100    100    100
             LRboot      100    100    100    100   79.2   80.3    100    100    100    100
       30    LRb3        100    100   99.7   87.7   32.?   31.6   88.1   99.7    100    100
             LRsk1       100   99.9   99.7   88.0   32.9   31.4   88.1   99.7    100    100
             LRboot      100   99.9   99.7   87.7   32.8   31.6   88.0   99.6    100    100
       10    LRb3       99.7   97.6   82.2   47.2   15.3   15.6   47.5   82.5   97.4   99.7
             LRsk1      99.8   97.8   82.9   47.8   15.5   15.8   48.0   82.9   97.6   99.7
             LRboot     99.7   97.6   82.3   47.2   15.3   15.5   47.7   82.2   97.3   99.7
       5     LRb3       95.7   84.1   60.6   25.8   11.6    9.5   26.6   53.8   79.3   93.7
             LRsk1      96.0   84.8   61.5   25.6   11.8   10.4   27.9   56.1   81.0   94.7
             LRboot     95.8   84.5   61.4   26.2   11.9    9.7   26.2   53.5   78.8   93.5

q = 3  100   LRb3        100    100    100    100   92.3   91.3    100    100    100    100
             LRsk1       100    100    100    100   92.2   91.3    100    100    100    100
             LRboot      100    100    100    100   92.2   91.4    100    100    100    100
       30    LRb3        100    100    100   96.2   41.9   41.6   95.1   99.9    100    100
             LRsk1       100    100    100   96.0   41.8   41.5   94.9    100    100    100
             LRboot     99.9    100    100   96.2   42.2   41.5   95.0    100    100    100
       10    LRb3        100   99.2   91.5   58.6   17.8   17.0   56.9   89.7   98.5   99.9
             LRsk1       100   99.2   91.4   58.3   17.8   17.3   57.2   89.7   98.7   99.9
             LRboot      100   99.2   91.4   58.3   17.7   17.2   56.9   89.6   98.5   99.9
       5     LRb3       98.0   90.4   69.4   35.8   11.3   11.8   35.6   68.0   88.7   97.4
             LRsk1      97.9   90.2   69.3   35.7   11.3   12.3   36.1   68.5   89.6   97.7
             LRboot     97.9   90.4   69.6   35.8   11.2   11.7   35.3   67.7   88.4   97.2
and 5, $LR_{boot}$ outperforms the competition.
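
Rejection rates such as those in Tables 3 and 5 are Monte Carlo frequencies: the percentage of replications in which a statistic exceeds the asymptotic critical value. The toy sketch below shows the computation and why an oversized test is left out of a power comparison. It does not fit any beta regression; it assumes, purely for illustration, a null statistic distributed as $c \times \chi^2_2$ with a hypothetical inflation factor $c = 1.3$, and a rescaled version that divides by $c$.

```python
import math
import random

def rejection_rate(stats, critical_value):
    """Percentage of Monte Carlo replications in which the test rejects."""
    return 100.0 * sum(s > critical_value for s in stats) / len(stats)

rng = random.Random(12345)
R = 20000                     # number of Monte Carlo replications (assumption)
c = 1.3                       # hypothetical inflation of the uncorrected statistic
crit = -2.0 * math.log(0.05)  # 5% critical value of chi-squared with 2 df

# chi-squared(2) draws by inverse transform: -2 log(1 - U), U uniform on [0, 1)
chi2_draws = [-2.0 * math.log(1.0 - rng.random()) for _ in range(R)]
inflated = [c * x for x in chi2_draws]   # behaves like an oversized statistic
rescaled = [s / c for s in inflated]     # rescaled statistic

rate_inflated = rejection_rate(inflated, crit)
rate_rescaled = rejection_rate(rescaled, crit)
print(rate_inflated, rate_rescaled)
```

In this toy setting the inflated statistic rejects a true null hypothesis roughly twice as often as the nominal 5%, so its nonnull rejection rates would overstate its power relative to tests whose size is close to nominal.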


5. An application 


This section contains an application of the corrected likelihood ratio tests using data from a random sample of 38 households in a large U.S. city; the source of the data is Griffiths et al. (1993, Table 15.4). Ferrari and Cribari-Neto (2004) fitted a beta regression model to these data. The response ($y$) is the proportion of income spent on food and the covariates are income ($x_2$) and the number of persons in the household ($x_3$). We also consider as candidate covariates the interaction between income and number of persons ($x_4 = x_2 \times x_3$), income squared ($x_5 = x_2^2$) and the square of the number of persons in the household ($x_6 = x_3^2$). The beta regression model we fit is

$$\mathrm{logit}(\mu_i) = \beta_1 + \beta_2 x_{2i} + \beta_3 x_{3i} + \beta_4 x_{4i} + \beta_5 x_{5i} + \beta_6 x_{6i}, \qquad (8)$$

where $i = 1, \ldots, 38$.
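
As a small illustration of how the regressors of model (8) are built from the two observed covariates, and of how a null hypothesis maps to dropped columns, consider the sketch below. The function names and the covariate values are hypothetical; only the definitions of $x_4$, $x_5$ and $x_6$ come from the text.

```python
def design_row(income, persons):
    """Regressors of model (8): intercept, x2, x3, x4 = x2*x3, x5 = x2^2, x6 = x3^2."""
    return [1.0, income, persons, income * persons, income ** 2, persons ** 2]

def restricted_row(row, dropped):
    """Regressors kept under a null hypothesis that sets the listed coefficients to zero."""
    return [x for j, x in enumerate(row) if j not in dropped]

row = design_row(50.0, 3.0)           # hypothetical household
dropped = [3, 4, 5]                   # positions of x4, x5, x6 (zero-based)
null_row = restricted_row(row, dropped)
q = len(dropped)                      # number of restrictions under test
print(row, null_row, q)
```

The number of dropped columns is the number of restrictions $q$, which is also the number of degrees of freedom of the chi-squared reference distribution.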

At the outset, we wish to make inference on the significance of the interaction variable ($x_4$), i.e., we wish to test $H_0: \beta_4 = 0$ against a two-sided alternative.

The likelihood ratio test statistic ($LR$) equals 3.859 (p-value: 0.049) and the corrected test statistics are $LR_{b3} = 3.208$ (p-value: 0.073) and $LR_{boot} = 3.192$ (p-value: 0.074). Inference is thus reversed when based on the corrected statistics: the likelihood ratio test rejects the null hypothesis at the 5% nominal level, whereas the two corrected tests do not.

We then remove the interaction variable ($x_4$) from the model and estimate the following reduced model:

$$\mathrm{logit}(\mu_i) = \beta_1 + \beta_2 x_{2i} + \beta_3 x_{3i} + \beta_5 x_{5i} + \beta_6 x_{6i}.$$

The point estimates are (standard errors in parentheses): $\hat\beta_1 = 0.4861$ (0.5946), $\hat\beta_2 = -0.0495$ (0.0218), $\hat\beta_3 = 0.0172$ (0.1563), $\hat\beta_5 = 0.0003$ (0.0002), $\hat\beta_6 = 0.0129$ (0.0198) and $\hat\phi = 39.296$ (8.925). We now wish to test $H_0: \beta_5 = \beta_6 = 0$. The statistics are $LR = 3.791$ (p-value: 0.150), $LR_{b3} = 3.296$ (p-value: 0.192) and $LR_{boot} = 3.210$ (p-value: 0.201). None of the three tests rejects the null hypothesis at the usual nominal levels.

We thus arrive at the following reduced model:

$$\mathrm{logit}(\mu_i) = \beta_1 + \beta_2 x_{2i} + \beta_3 x_{3i}.$$

The point estimates (standard errors in parentheses) are $\hat\beta_1 = -0.6225$ (0.224), $\hat\beta_2 = -0.0123$ (0.003), $\hat\beta_3 = 0.1185$ (0.035) and $\hat\phi = 35.61$ (8.080).
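
To see what the final estimates imply, the sketch below evaluates the fitted mean and the implied standard deviation of the response for one household. The household (income 60, three persons) is hypothetical, and the income units are those of the original data set, which are not restated here. The standard deviation uses the beta regression variance $\mathrm{var}(y) = \mu(1-\mu)/(1+\phi)$.

```python
import math

# Point estimates reported for the final reduced model
beta1, beta2, beta3, phi = -0.6225, -0.0123, 0.1185, 35.61

def fitted_mean(income, persons):
    """Inverse logit of the estimated linear predictor."""
    eta = beta1 + beta2 * income + beta3 * persons
    return 1.0 / (1.0 + math.exp(-eta))

def implied_sd(mu, precision):
    """Standard deviation of a beta response with mean mu and the given precision."""
    return math.sqrt(mu * (1.0 - mu) / (1.0 + precision))

mu = fitted_mean(income=60.0, persons=3.0)  # hypothetical household
sd = implied_sd(mu, phi)
print(round(mu, 3), round(sd, 3))
```

The negative income coefficient and the positive household-size coefficient mean that the fitted food share falls with income and rises with the number of persons.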

We now return to Model (8) and test the joint exclusion of the three candidate regressors, i.e., we test $H_0: \beta_4 = \beta_5 = \beta_6 = 0$. For this test we obtain $LR = 7.6501$ (p-value: 0.054), $LR_{b3} = 6.554$ (p-value: 0.088) and $LR_{boot} = 6.068$ (p-value: 0.108). The p-value of the unmodified test is very close to 5%, whereas the p-values of the corrected tests indicate that the null hypothesis is not to be rejected at the 5% nominal level. It is noteworthy that the null hypothesis is not rejected by the bootstrap Bartlett corrected test even at the 10% nominal level.
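
The p-values reported in this section can be checked against the chi-squared reference distribution with 1, 2 and 3 degrees of freedom. The sketch below uses closed-form upper-tail probabilities for these three cases only; the function name and the layout of the `reported` list are hypothetical, while the statistics and p-values are those quoted in the text.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of the chi-squared distribution for df in {1, 2, 3}."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    if df == 3:
        return (math.erfc(math.sqrt(x / 2.0))
                + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))
    raise ValueError("only df = 1, 2, 3 are implemented in this sketch")

# (null hypothesis, degrees of freedom, {statistic name: (value, reported p-value)})
reported = [
    ("beta4 = 0", 1, {"LR": (3.859, 0.049), "LRb3": (3.208, 0.073), "LRboot": (3.192, 0.074)}),
    ("beta5 = beta6 = 0", 2, {"LR": (3.791, 0.150), "LRb3": (3.296, 0.192), "LRboot": (3.210, 0.201)}),
    ("beta4 = beta5 = beta6 = 0", 3, {"LR": (7.6501, 0.054), "LRb3": (6.554, 0.088), "LRboot": (6.068, 0.108)}),
]

for hypothesis, df, stats in reported:
    for name, (value, p_reported) in stats.items():
        print(hypothesis, name, round(chi2_sf(value, df), 3), p_reported)
```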


6. Conclusions 


The class of beta regression models is commonly used when the interest lies in modeling the behavior of variables that assume values in the standard unit interval, such as rates and proportions. Testing inference is typically performed via the likelihood ratio test, which relies on asymptotic critical values, i.e., critical values obtained from the limiting null distribution of the test statistic. When the sample size is small, however, the approximation tends to be poor and size distortions take place. It is thus important to develop testing inference strategies that are more accurate when the sample size is not large. Ferrari and Pinheiro (2011) derived two modified likelihood ratio test statistics for testing restrictions on beta regressions that typically yield more reliable inferences. They considered a very general class of models, which allows for nonlinearities and varying dispersion. In this paper, we derived three Bartlett corrected likelihood ratio test statistics for fixed dispersion beta regressions. The derivation is considerably more cumbersome than that of Ferrari and Pinheiro (2011), especially because $\beta$ and $\phi$ are not orthogonal. A clear advantage of our approach is that it delivers tests with higher order of accuracy. That is, the size distortions of our tests vanish faster than those of the unmodified likelihood ratio test and also than those of the modified tests proposed by Ferrari and Pinheiro (2011). We also considered a different approach in which bootstrap data resampling is used to estimate the Bartlett correction factor. We reported results of Monte Carlo simulations that show that the likelihood ratio test tends to be quite liberal (oversized) in small samples. The numerical evidence also shows that the corrected tests deliver much more accurate testing inference. In particular, one of the analytically derived Bartlett corrected tests ($LR_{b3}$) and the bootstrap Bartlett corrected test display superior finite sample behavior. We strongly encourage practitioners to base inference on such tests when performing beta regression analyses.
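
For readers who want to see how a Bartlett-type rescaling acts on a statistic, the following sketch may help. It rests on assumptions not restated in this section: that the three corrected statistics take the forms $LR/c$, $LR\exp\{-(c-1)\}$ and $LR\{1-(c-1)\}$ for a correction factor $c$, and that the bootstrap version is $LR \times q / \overline{LR^*}$, where $\overline{LR^*}$ is the average of the bootstrap statistics. The function names and the bootstrap values are hypothetical. The check against Table 4 treats $c$ as a fixed number, whereas in practice it is estimated for each sample.

```python
import math

def bartlett_variants(lr, c):
    """Three asymptotically equivalent Bartlett-type rescalings (assumed forms)."""
    return {
        "LRb1": lr / c,
        "LRb2": lr * math.exp(-(c - 1.0)),
        "LRb3": lr * (1.0 - (c - 1.0)),
    }

def bootstrap_bartlett(lr, lr_bootstrap, q):
    """Rescale LR so that the average of the bootstrap statistics matches q."""
    mean_star = sum(lr_bootstrap) / len(lr_bootstrap)
    return lr * q / mean_star

# Rough consistency check with the simulated means in Table 4 (q = 2):
# the factor implied by the means of LR and LRb1, applied to the mean of LR.
c_implied = 2.6741 / 2.1353
means = bartlett_variants(2.6741, c_implied)
print(round(c_implied, 4), {k: round(v, 4) for k, v in means.items()})

# Hypothetical bootstrap statistics for an observed LR = 3.859 with q = 1
lr_boot = bootstrap_bartlett(3.859, [0.9, 1.6, 0.4, 2.1, 1.0], q=1)
print(round(lr_boot, 4))
```

Under these assumed forms, the values implied for $LR_{b2}$ and $LR_{b3}$ agree with the corresponding means in Table 4 to about three decimal places.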
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Appendix A. Cumulants for the Bartlett correction factor 


In this appendix we present the derivatives of the log-likelihood function up to the fourth order with respect to the unknown parameters and obtain their moments. Cumulants up to the third order can be found in Ospina et al.

At the outset, we define the following quantities: 
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Closed-form expressions for dujildcj), d^*/d(f) and d'^^,*/d(jP are given below. 
Additionally, we have 
In particular, if the link function is logit, i.e.,
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as in our numerical evaluation, it follows that 
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We also obtain the following derivatives: 
Using the above expressions, the second, third and fourth order derivatives of the log-likelihood function are


i=l 


TT ST ) J.2 I dHi\ , , r ♦ I u am \ ufXi 

Urs = '^i-<l>C0d— 


dr]i 


d dni\ dfXi 


dfjLi drji J diji 


^ Xi'pXi 


= -^ [ct - {y* - y*i)]^Xir 
2 = 1 


U(p(p — ^ ^ di-) 


2=1 


/ dfii 


Urst — 0 ^ ^ ^ ^ j “1“ [y^ ] bi 


Urs4> - i > . + iy^ ) a„_. Ci ^ 


dVt 


d dfXi 


dfXi drji 


d dfii \ 1 dy 


dfXi drji J J drji 


Ur(l)(l) — ^ ^ y* t^ir J 

i=l 
n 

Ufjyfjyfj) = ^ ^ Si^ 

2=1 

"" f 

Urstu — 0 ^ ^ 0 


2=1 


mi 


(9 fdfiA^ drui (dfii 


djj^i \ dr]i J dfii \ dr}^ 


f dtti ^ 

1 doji 

y 

l^i + - CLi 

Ojli 


Urst<p — ^ ^ ^ ^ 


2 = 1 


(j) ( 3mt + (j) 


, * dbi \ dyi 

yVi l-^i ) o I J ^ir^is^it^iui 


drrii \ ( dyi 


d(j) J \dyi 


dyi J drji' 

I I doJi \ , dy* 


{vt - M-) } 


Xi'pXisXit 5 


21 



_ dvi djii 

^rsdxb — / 7^ 5 

“ ^fi^ dr]^ 
_ dsi d^i 


U, 


dsi 


4’4‘4’4' 


E Ut>i 

d(j> 


By taking the expected values of the above derivatives, we obtain the following cumulants:
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